The IRST English-Spanish translation system for European Parliament speeches
Abstract
This paper presents the spoken language translation system developed at FBK-irst during the TC-STAR project. The system integrates automatic speech recognition with machine translation by means of confusion networks, which can represent a very large number of transcription hypotheses generated by the speech recognizer. The confusion networks are decoded efficiently by a statistical machine translation system, which computes the most probable translation in the target language. The paper describes the complete architecture developed for translating political speeches delivered at the European Parliament, from English to Spanish and vice versa, and at the Spanish Parliament, from Spanish to English.
Similar resources
The 2006 RWTH parliamentary speeches transcription system
This work presents investigations carried out during the development of the RWTH automatic speech recognition systems for the second TC-STAR evaluation campaign in 2006. The systems were designed to transcribe parliamentary speeches taken from the European Parliament Plenary Sessions (EPPS) in European English and Spanish, as well as speeches from the Spanish Parliament. The RWTH sys...
ECPC: el discurso parlamentario europeo desde la perspectiva de los estudios traductológicos de corpus
This paper presents the main outcome of the ECPC research group: an archive of European parliamentary speeches created to study this genre and the hypothetical influence of translation in the construction of European identity. The archive is made up of, on the one hand, a parallel corpus containing the English and Spanish versions of the European Parliament proceedings, and on the other hand, t...
Modelo estocástico de traducción basado en N-gramas de tuplas bilingües y combinación log-lineal de características
This communication introduces a stochastic machine translation system based on N-gram modelling of the joint probability of bilingual texts. The basic unit of this model is called a tuple and consists of a pair of word strings: one in the source language (to be translated) and one in the target language (the translation). Translation is driven by a log-linear combination of the N-gram model probability and other feat...
Recent Advances in Spoken Language Translation
The talk is structured in three parts. The first part gives an overview of problems and approaches in spoken language translation. The second part presents the challenges and achievements of the European project TC-STAR, which ended in 2007. The third part describes advances in the use of confusion networks as an interface between automatic speech recognition and machine translation. In particular, I will discuss...
The ISL Statistical Machine Translation System for the TC-STAR Spring 2006 Evaluation
In this paper we describe the ISL statistical machine translation system used in the TC-STAR Spring 2006 Evaluation campaign. This system is based on PESA phrase-to-phrase translations which are extracted from a bilingual corpus. The translation model, language model and other features are combined in a log-linear model during decoding. We participated in the Spanish Parliament (Cortes) and Eur...